Abstract

This paper presents a concatenated turbo coding system in which a Reed-Solomon
outer code is concatenated with a binary turbo inner code. In the proposed system, the 
outer code decoder and the inner turbo code decoder interact to achieve both good bit 
error and frame error performances. The outer code decoder helps the inner turbo code 
decoder to terminate its decoding iteration while the inner turbo code decoder provides 
soft-output information to the outer code decoder to carry out a reliability-based soft- 
decision decoding. In the case that the outer code decoding fails, the outer code decoder 
instructs the inner code decoder to continue its decoding iterations until the outer code 
decoding is successful or a preset maximum number of decoding iterations is reached. 
This interaction between outer and inner code decoders reduces decoding delay. Also 
presented in the paper are an effective criterion for stopping the iteration process of 
the inner code decoder and a new reliability-based decoding algorithm for nonbinary
codes. 


•This research was supported by NSF under Grants NCR 94-15374, CCR 97-32959, CCR 98-14054 and 
NASA under Grant NAG 8-931. 




1. Introduction 

Although turbo codes with iterative decoding [1,2,3] have been shown to achieve bit-error
rates (BER's) of $10^{-5}$ or better at SNR's within 1 dB of the SNR for which the code rate
equals channel capacity, they suffer from three disadvantages: (1) a large decoding delay
due to the large block lengths and many decoding iterations required for near-capacity
performance, (2) significantly weakened performance at BER's below $10^{-5}$ due to the fact
that the component codes have relatively poor minimum distances, which manifests itself
at very low BER's, and (3) a relatively poor frame error performance. The large decoding
delay makes turbo codes unsuitable for real-time applications such as voice transmission
and packet communications in high-speed networks. The fact that turbo codes do not have
large minimum distances causes the BER curve to flatten out at BER's below $10^{-5}$. This
phenomenon is called the error floor. Because of the error floor, turbo codes are not suitable
for applications requiring extremely low BER's, such as some scientific or command and
control applications. Poor frame error performance is due to the fact that turbo decoding
is devised to minimize the bit error probability, not the frame error probability. Even though a
decoded block may contain very few errors, it is still an erroneous block. Poor frame error
performance also makes these codes unsuitable for many communication applications where
reliable frame transmission is required. There are measures that can be taken to mitigate
the error floor and poor frame error performance problems. One such measure is to use
a powerful Reed-Solomon (RS) outer code in concatenation with a turbo inner code in a
proper way.

In this paper, we present an interactive concatenated turbo coding system in which an
RS outer code is concatenated with a high rate binary turbo inner code, and the outer code
decoder and the inner turbo code decoder interact to achieve both good bit-error and
frame-error performances. The inner turbo decoder consists of two component decoders which
operate in parallel mode. The two component decoders process their inputs simultaneously.
At the completion of a decoding phase, their decoded outputs (log-likelihood ratios and hard
decisions of the decoded binary symbols) are compared. When the comparison satisfies a
certain criterion, the inner turbo decoder stops its decoding iteration and the outer code
decoder takes over and completes the decoding process. If the outer code decoding is not
successful (i.e., a decoding failure), the outer code decoder instructs the inner turbo decoder
to continue its decoding iterations until the number of symbol errors at the input of the outer
decoder is reduced to within the error correction capability of the outer code. The interactive
process continues until either the outer decoding is successful or a preset maximum number
of decoding iterations for the inner turbo decoder is reached. In the latter case, the outer
code decoder computes the reliability values of its input symbols based on the soft output
information (log-likelihood ratios of the decoded bits) of the inner turbo code decoder and carries
out a reliability-based soft-decision decoding algorithm.

Also presented in this paper are a new stopping criterion for the inner turbo decoding 
and a new reliability-based algorithm for decoding nonbinary block codes. The new stopping 
criterion with the aid of outer code decoding effectively terminates the turbo decoding process 
with negligible degradation in error performance compared with the cross entropy (CE) 
stopping criterion [3]. It provides a significant reduction in the number of decoding iterations
and hence reduces decoding delay. The new reliability-based decoding algorithm is devised
by combining the Chase-2 decoding algorithm [4] and the generalized minimum distance
(GMD) decoding algorithm [5]. This decoding algorithm provides a good trade-off between
the error performance of the Chase-2 algorithm and the decoding complexity of the GMD algorithm.

Simulation results show that the proposed concatenated turbo coding system with the 
new stopping criterion for the inner turbo decoding and the new reliability-based algorithm 
for decoding the outer RS code achieves both good bit error and frame error performances 
and reduces decoding delay.

2. Turbo Codes, Parallel Turbo Decoding and Bit Matching Stopping Criterion

A turbo code with linear block codes as component codes is obtained by parallel concatenation
of two systematic linear block codes with a pseudo-random interleaver $\Pi$ between the two
encoders as shown in Figure 1. Assume that the two component codes are identical and both are
binary $(n_i, k_i)$ linear block codes. Let $u = (u_0, u_1, \ldots, u_{K-1})$ be the information sequence to
be encoded for transmission, where $K = \ell k_i$. The first encoder encodes this sequence and
produces a block of $\ell(n_i - k_i)$ parity-check bits, denoted $p^{(1)}$. The interleaver $\Pi$ permutes the
information sequence $u$ into a sequence $u' = \Pi(u)$. The second encoder encodes $u'$ and
produces a block of $\ell(n_i - k_i)$ parity-check bits, denoted $p^{(2)}$. Then the sequence $(u, p^{(1)}, p^{(2)})$ is
the code sequence for the information sequence $u$. The collection of $2^K$ such code sequences,
one for each information sequence $u$, forms a turbo code of length $N = \ell(2n_i - k_i)$. Since the
component codes are block codes, it is called a block turbo code.

The decoder for a turbo code with two component codes consists of two soft-input/soft-output
(SISO) MAP (or APP) decoders which operate iteratively [1,2,3]. Since the two
component codes are identical, the two MAP decoders are identical. Decoding can be carried
out in either serial mode or parallel mode as shown in Figures 2(a) and 2(b), respectively.
The serial decoding mode was originally proposed by Berrou et al. [1], and the parallel
decoding mode was later proposed by Divsalar and Pollara [6]. In serial mode, decoder 1 and
decoder 2, denoted DEC1 and DEC2, respectively, operate alternately. In parallel mode,
the two MAP decoders, DEC1 and DEC2, operate simultaneously. Decoding consists of a
sequence of iterations, and each decoding iteration consists of two phases. In serial mode, DEC1
operates in the first phase and DEC2 operates in the second phase as shown in Figure 2(a).
In parallel mode, however, both DEC1 and DEC2 operate in each phase as shown in Figure 2(b).

In the proposed interactive concatenated turbo coding system, inner turbo decoding is
performed in parallel mode. Suppose a code sequence $(u, p^{(1)}, p^{(2)})$ is transmitted. Let
$r = (r_0, r_1, \ldots, r_{N-1})$ be the received sequence. For decoding, this received sequence is
decomposed into two subsequences $r^{(1)} = (r_0^{(1)}, r_1^{(1)}, \ldots)$ and $r^{(2)} = (r_0^{(2)}, r_1^{(2)}, \ldots)$, corresponding
to the code sequences $(u, p^{(1)})$ and $(u', p^{(2)})$ at the outputs of the two component encoders,
respectively. Each SISO decoder has two inputs and two outputs as shown in Figure 2(c). The
inputs to each decoder are the a priori L-values (log-likelihood values) $L(u_i)$ for all information
bits $u_i$ and the received channel L-values $L_c r_j$ for all code bits, where $L_c = 4aE_s/N_0$ and
$E_s/N_0$ is the channel SNR. For a fading channel, $a$ denotes the fading amplitude, whereas
for an AWGN channel, $a = 1$. Based on its inputs, the SISO decoder computes L-values
(soft outputs)

$$L(\hat{u}_i) = L(u_i \mid r) = \log \frac{P(u_i = 1 \mid r)}{P(u_i = 0 \mid r)}$$

for all information bits and delivers an extrinsic L-value $L_e(u_i)$ for each information bit which
contains the reliability information from all other coded bits in the code sequence and is not
influenced by $L(u_i)$ and $L_c r_i$ of the current bit $u_i$.

Consider turbo decoding in parallel mode. At the first iteration, DEC1 and DEC2 start
decoding at the same time. The inputs to DEC1 and DEC2 are the channel L-values $L_c r^{(1)}$ and
$L_c r^{(2)}$, respectively. For equally likely information bits, the a priori L-value $L(u_i)$ inputs
to both SISO decoders in the first phase of the first iteration are zero. Hence, we set
$L^{(1)}(u_i) = 0$ and $L^{(2)}(u_i) = 0$ for each information bit. The outputs of DEC1 and DEC2 are
the L-values $L^{(1)}(\hat{u}_i)$ and $L^{(2)}(\hat{u}_i)$ and the extrinsic L-values $L_e^{(1)}(u_i)$ and $L_e^{(2)}(u_i)$, respectively, with
$0 \le i < K$. Then the second phase starts. The inputs to DEC1 are the channel L-values $L_c r^{(1)}$
and the extrinsic values $L_e^{(2)}(u_i)$, which are the outputs of DEC2 in the first decoding phase.
The inputs to DEC2 are $L_c r^{(2)}$ and the extrinsic values $L_e^{(1)}(u_i)$, which are the outputs of DEC1 in the first
decoding phase. The second decoding phase is then performed. All the subsequent iterations
are carried out in the same manner as the first iteration except that the a priori L-values
$L^{(1)}(u_i)$ and $L^{(2)}(u_i)$ of each information bit at the inputs of the two SISO decoders in the
first decoding phase are the extrinsic L-values $L_e^{(2)}(u_i)$ and $L_e^{(1)}(u_i)$, respectively, which are
the outputs of the two decoders in the second decoding phase of the previous iteration.
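
The parallel schedule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the SISO decoder is a caller-supplied function `siso` with a hypothetical signature, the interleaver is a plain index permutation `perm`, and the interleaving and de-interleaving of extrinsic values between the two decoders (left implicit in the text) is made explicit.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical SISO interface (an assumption of this sketch):
#   siso(channel_L, apriori_L) -> (L_values, extrinsic_L)
# where apriori_L, L_values and extrinsic_L each hold K information-bit values.
Siso = Callable[[Sequence[float], Sequence[float]],
                Tuple[List[float], List[float]]]


def interleave(x: Sequence[float], perm: Sequence[int]) -> List[float]:
    """Permute x so that output[i] = x[perm[i]]."""
    return [x[p] for p in perm]


def deinterleave(x: Sequence[float], perm: Sequence[int]) -> List[float]:
    """Inverse of interleave()."""
    out = [0.0] * len(x)
    for i, p in enumerate(perm):
        out[p] = x[i]
    return out


def parallel_turbo_phases(siso: Siso,
                          Lc_r1: Sequence[float],
                          Lc_r2: Sequence[float],
                          perm: Sequence[int],
                          num_phases: int) -> Tuple[List[float], List[float]]:
    """Run num_phases parallel phases (two phases make one iteration).

    Returns the L-values of DEC1 and DEC2, both in natural bit order.
    """
    if num_phases < 1:
        raise ValueError("at least one decoding phase is required")
    K = len(perm)
    apriori1 = [0.0] * K  # a priori input of DEC1, natural order
    apriori2 = [0.0] * K  # a priori input of DEC2, interleaved order
    for _ in range(num_phases):
        # Both decoders work in the same phase on their own inputs.
        L1, ext1 = siso(Lc_r1, apriori1)
        L2, ext2 = siso(Lc_r2, apriori2)
        # Exchange extrinsic values: each decoder's output becomes the
        # other decoder's a priori input in the next phase.
        apriori1 = deinterleave(ext2, perm)
        apriori2 = interleave(ext1, perm)
    return L1, deinterleave(L2, perm)
```

Counting phases rather than iterations keeps the loop body identical for every phase, which is the property that lets a stopping test run after each phase.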

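After a sufficient number of iterations (or decoding phases), we can stop the decoding
process and obtain the L-value for each information bit as follows:

$$L(\hat{u}_i) = \begin{cases} L^{(1)}(\hat{u}_i), & \text{[condition illegible in the source]} \\ L^{(2)}(\hat{u}_i), & \text{otherwise,} \end{cases} \tag{1}$$

where $L^{(1)}(\hat{u}_i) = L_c r_i^{(1)} + L^{(1)}(u_i) + L_e^{(1)}(u_i)$ and $L^{(2)}(\hat{u}_i) = L_c r_i^{(2)} + L^{(2)}(u_i) + L_e^{(2)}(u_i)$.

Finally, the hard-decision decoded information bit $\hat{u}_i$ is made based on

$$\hat{u}_i = \begin{cases} 0, & \text{if } L(\hat{u}_i) < 0 \\ 1, & \text{if } L(\hat{u}_i) > 0, \end{cases}$$

for $0 \le i < K$.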

As the iterative decoding approaches the performance limit of a given turbo code, any
further iteration results in very little improvement in performance. Therefore it is important
to devise an efficient criterion to stop the iteration process and prevent unnecessary
computations and decoding delay. Several stopping criteria have been devised [3,7]. Both
the sign change and bit matching criteria proposed in [7] are more computationally efficient
than the cross entropy (CE) criterion proposed in [3].

The bit matching (BM) criterion of [7] can be applied to terminate decoding in parallel
mode in a straightforward manner. At the $j$-th decoding phase of the $k$-th iteration for
$j = 1, 2$ and $k = 1, 2, \ldots$, we check the hard decisions based on the L-values $L^{(1)}(\hat{u}_i)$ and
$L^{(2)}(\hat{u}_i)$ generated by DEC1 and DEC2, respectively, for each information bit. If these
hard decisions agree with each other for all the information bits in the whole sequence, we
terminate the decoding process at the $j$-th phase of the $k$-th iteration.
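
As an illustration, the BM test reduces to comparing two sign vectors. The sketch below assumes both L-value lists are already in the same (natural) bit order and maps an L-value of exactly zero to bit 0, a case the decision rule in the text does not cover; the function names are hypothetical.

```python
from typing import List, Sequence


def hard_decisions(L_values: Sequence[float]) -> List[int]:
    """Hard decision per bit: 1 if L > 0, otherwise 0.

    Mapping L == 0 to 0 is an assumption of this sketch.
    """
    return [1 if L > 0 else 0 for L in L_values]


def bit_matching_satisfied(L1: Sequence[float], L2: Sequence[float]) -> bool:
    """True when the hard decisions of DEC1 and DEC2 agree in every position."""
    if len(L1) != len(L2):
        raise ValueError("both decoders must report the same number of bits")
    return hard_decisions(L1) == hard_decisions(L2)
```

For example, `bit_matching_satisfied([1.2, -0.3, 4.0], [0.1, -2.0, 0.5])` holds because the signs agree in all three positions, even though the magnitudes differ.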

At each phase, the BM criterion requires $2K$ binary operations to make hard decisions
based on $L^{(1)}(\hat{u}_i)$ and $L^{(2)}(\hat{u}_i)$, and $K$ logic operations to check whether the BM criterion is
satisfied. However, testing the CE criterion at each iteration requires a total of $5K - 1$
real number operations, including $2K - 1$ additions and subtractions, $2K$ multiplications
and divisions, and $K$ exponentiations. Therefore, the BM criterion requires much simpler
computations than the CE criterion.

Simulation results show that the BM criterion saves more iterations than the CE criterion 
with negligible degradation in error performance. Consider the turbo code with the (64,57) 
distance-4 extended Hamming code as the two component codes and a block interleaver of 
size K = 57 x 57. The error performances of decoding in parallel mode with BM stopping 
criterion and serial mode with CE stopping criteria are shown in Figure 3(a). We see that 
decoding in parallel mode with BM stopping criterion outperforms decoding in serial mode 
with CE stopping criterion. The average numbers of decoding iterations required using BM 
and CE criteria, respectively, for parallel mode decoding of the above turbo code are shown 
in Figure 3(b). We see that the BM stopping criterion saves more decoding iterations than 
the CE stopping criterion and hence reduces computational complexity. From Figure 3(a), 
we also see that the frame error performance is relatively poor compared with the bit error 
performance. The error floor starts at frame error probability of 10 -J . This error floor will 
be removed when the proposed concatenated turbo system is used. 

3. Chase-GMD Decoding Algorithm

RS codes are commonly decoded with an algebraic decoding algorithm, such as the Euclidean
algorithm, to keep the decoding complexity low in applications. To improve the error
performance, soft-decision decoding must be used. However, soft-decision decoding of RS
codes significantly increases the decoding complexity. One approach to improving the performance
of algebraic decoding while keeping the decoding complexity low is to use an algebraic
decoder to generate a sequence of candidate codewords based on the reliability values of
the received symbols, and then choose the candidate codeword with the best metric as the
decoded codeword. The two best-known decoding algorithms of this kind are the GMD algorithm
[5] and the Chase-2 algorithm [4]. Both algorithms improve the error performance of algebraic
decoding. For an RS code over GF($q$) with minimum distance $d$, the GMD algorithm requires
at most $\lfloor (d+1)/2 \rfloor$ algebraic decodings, while the Chase-2 algorithm needs to perform
algebraic decodings based on $q^{\lfloor d/2 \rfloor}$ test error patterns with errors confined to the $\lfloor d/2 \rfloor$
least reliable positions of the received sequence. The Chase-2 algorithm outperforms the GMD
algorithm; however, it requires many more decoding computations. For long RS codes over a
large field GF($q$) with large minimum distance $d$, the Chase-2 algorithm becomes impractical.
The GMD decoding algorithm, while simple, gives only a small improvement in error performance
over pure algebraic decoding for small to medium SNR's, especially for long RS codes.
Therefore, GMD is not very attractive for practical applications at small to medium SNR's
and must be improved.

In this section, we present a decoding algorithm which combines Chase-2 and GMD 
algorithms. It provides a good trade-off between the error performance of Chase-2 algorithm 
and the decoding complexity of GMD algorithm. We call this decoding algorithm Chase- 
GMD algorithm. 

Consider an $(n_o, k_o, d)$ RS code over GF($q$) with $q = 2^m$. Let $x = (x_0, x_1, \ldots, x_{n_o-1})$ be
a codeword. For binary transmission, every code symbol $x_i$ is expanded into a binary
$m$-tuple. Let $y = (y_0, y_1, \ldots, y_{n_o-1})$ be the unquantized received sequence at the output of the
matched filter in the receiver, where $y_i$ represents a vector $(y_{i,0}, y_{i,1}, \ldots, y_{i,m-1})$ composed of
$m$ real numbers. Let $z = (z_0, z_1, \ldots, z_{n_o-1})$ be the hard-decision received sequence obtained
from $y$ with $z_i$ in GF($2^m$). A real number $\alpha_i$ is assigned to each hard-decision received
symbol $z_i$ to indicate its reliability. There are a number of ways to define the $\alpha_i$'s [5,8]. In
the proposed concatenated turbo coding system, inner turbo decoding not only gives
the hard decision of each information bit but also provides its reliability L-value. Based on
the bit reliability values, we can easily compute the reliability value of each hard-decision
received symbol $z_i$. Let $(x_{i,0}, x_{i,1}, \ldots, x_{i,m-1})$ be the binary $m$-tuple expansion of code symbol
$x_i$. With respect to inner encoding, each $x_{i,j}$ with $0 \le j < m$ is an information bit. The
inner turbo decoding provides each bit $x_{i,j}$ a reliability L-value, $L(\hat{x}_{i,j})$. Then the reliability
value of the $i$-th hard-decision received symbol $z_i$ is

$$\alpha_i = \prod_{j=0}^{m-1} p_{i,j},$$

where

$$p_{i,j} = \frac{e^{|L(\hat{x}_{i,j})|}}{1 + e^{|L(\hat{x}_{i,j})|}}$$

for $0 \le i < n_o$ and $0 \le j < m$. The larger $\alpha_i$ is, the more reliable $z_i$ is.
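
A small sketch of how symbol reliabilities could be derived from bit L-values. It assumes that the per-bit quantity is $e^{|L|}/(1+e^{|L|})$ (the probability that the bit's hard decision is correct) and that the $m$ per-bit values of a symbol are combined by a product; the combining rule is partly illegible in the source, so the product is an assumption here, as are the function names.

```python
import math
from typing import List, Sequence


def bit_correct_probability(L: float) -> float:
    """e^{|L|} / (1 + e^{|L|}), written in a numerically stable form."""
    return 1.0 / (1.0 + math.exp(-abs(L)))


def symbol_reliabilities(bit_L: Sequence[float], m: int) -> List[float]:
    """Group bit L-values into m-bit symbols and return one alpha per symbol.

    Assumption: alpha is the product of the per-bit probabilities.
    """
    if m <= 0 or len(bit_L) % m != 0:
        raise ValueError("number of bit L-values must be a multiple of m")
    alphas = []
    for start in range(0, len(bit_L), m):
        alpha = 1.0
        for L in bit_L[start:start + m]:
            alpha *= bit_correct_probability(L)
        alphas.append(alpha)
    return alphas
```

Sorting symbol positions by the returned values in increasing order gives the "least reliable first" ordering that reliability-based decoding relies on.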

Now we describe the Chase-GMD algorithm. Without loss of generality, we assume that
the hard-decision received symbols in $z$ are ordered in the order of increasing reliability such
that $\alpha_i \le \alpha_j$ for $i < j$. We also assume that an error-and-erasure algebraic decoder [5] is
used to generate candidate codewords, which corrects $e$ errors and $s$ erasures provided that
$s + 2e < d$. For $0 \le P \le \lfloor d/2 \rfloor$, let $E$ denote the set of $q^P$ test error patterns with errors
(nonzero components) confined to the $P$ least reliable positions. Let CGA($P$) denote the
Chase-GMD algorithm with parameter $P$. CGA($P$) processes all the vectors $w = z + e$
with $e$ in $E$. Define the following set of integers:

$$I(P) = \{ i : 0 \le i \le d - 2P - 1 \text{ and } d - i \text{ is odd} \}.$$

For each $w$ and each integer $i \in I(P)$, erase $i$ symbols of $w$ from symbol position
$P + 1$ to symbol position $P + i$. This results in a vector $w^*$ with $i$ erasures. Perform
error-and-erasure decoding on $w^*$. If decoding is successful, the decoded codeword is a candidate
codeword. After performing

$$q^P \left( \lfloor (d+1)/2 \rfloor - P \right) \tag{2}$$

decodings, we obtain a set of candidate codewords. Among these candidate codewords,
the one with the best metric is the decoded codeword. For $P = 0$, CGA(0) is simply the
GMD algorithm and for $P = \lfloor d/2 \rfloor$, CGA($P$) is simply the Chase-2 algorithm. It can be
proved that the performance of CGA($P$) improves as $P$ increases [9].
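
The erasure schedule and the decoding count can be checked numerically. The sketch below reads the index set as $I(P) = \{i : 0 \le i \le d - 2P - 1,\ d - i \text{ odd}\}$ (the inequality signs are a reconstruction, so treat them as an assumption) and confirms that its size equals $\lfloor (d+1)/2 \rfloor - P$; the function names are hypothetical.

```python
from typing import List


def erasure_counts(d: int, P: int) -> List[int]:
    """Numbers of erasures tried by CGA(P): i in [0, d-2P-1] with d - i odd."""
    return [i for i in range(0, d - 2 * P) if (d - i) % 2 == 1]


def num_decodings(q: int, d: int, P: int) -> int:
    """Total error-and-erasure decodings: q^P test patterns times |I(P)|."""
    return q ** P * len(erasure_counts(d, P))


# For a distance-7 code, GMD (P = 0) tries 0, 2, 4 and 6 erasures.
gmd_schedule = erasure_counts(7, 0)
```

With $d = 7$ and $q = 32$ the count grows from 4 decodings at $P = 0$ to $32^3 = 32768$ at $P = 3$, which shows how quickly the unrestricted test pattern set becomes expensive.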

The computational complexity of CGA($P$) is between those of the GMD and Chase-2
algorithms. However, for large $q$ and $d$, the number of decodings given by (2) is still very
discouraging for practical applications. We may modify this algorithm for further reduction
in computational complexity. For $0 \le i < P$, compute the conditional probabilities $p(y_i \mid g)$
for every $g$ in GF($q$). Let $\Lambda_i(q')$ denote the set of $q'$ symbols in GF($q$) that give the $q'$ largest
conditional probabilities $p(y_i \mid g)$. Let $E'$ denote the set of test error patterns with nonzero
components confined to the first $P$ positions and chosen from $\Lambda_i(q')$ for $0 \le i < P$. There
are $q'^P$ error patterns in $E'$. In the modified CGA($P$) algorithm, we use the error patterns in
$E'$ to generate candidate codewords by decoding $w' = z + e'$ with $e'$ in $E'$ and $i$ erasures in
$w'$ starting from symbol position $P + 1$. Denote this modified algorithm by CGA($P, q'$).
The total number of algebraic decodings required by CGA($P, q'$) is $q'^P (\lfloor (d+1)/2 \rfloor - P)$. It
is clear that for $P = 0$, CGA($0, q'$) is still the GMD algorithm.
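
One way to picture the restricted candidate set is to rank the $2^m$ possible symbol values at a position by likelihood and keep the best $q'$. The sketch below is an assumption-laden illustration: it works from per-bit L-values with the convention $L = \log P(1)/P(0)$ and treats the bits as independent, so a symbol's log-likelihood equals, up to a constant, the sum of the L-values of its 1-bits. The text itself defines the set through $p(y_i \mid g)$; the function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def most_likely_symbols(bit_L: Sequence[float], q_prime: int) -> List[Tuple[int, ...]]:
    """Return the q_prime most likely m-bit patterns, best first.

    Assumes independent bits with L = log P(bit=1)/P(bit=0).
    Exhaustive over 2^m patterns, so only meant for small m.
    """
    scored = []
    for bits in product((0, 1), repeat=len(bit_L)):
        score = sum(L for b, L in zip(bits, bit_L) if b == 1)
        scored.append((score, bits))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [bits for _, bits in scored[:q_prime]]


def num_decodings_modified(q_prime: int, d: int, P: int) -> int:
    """q'^P * (floor((d+1)/2) - P) algebraic decodings for CGA(P, q')."""
    return q_prime ** P * ((d + 1) // 2 - P)
```

With $q' = 2$ the two candidates are the hard decision and the pattern obtained by flipping the least reliable bit, and for $d = 7$ the count for CGA(3,2) is 8 decodings against 4 for GMD.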

Consider decoding the (31,25,7) RS code over GF($2^5$). Then $q = 32$. Suppose we choose
$q' = 2$. For $P$ = 0, 1, 2 and 3, the bit error and block error performances of CGA($P$, 2) are
shown in Figure 4(a). For comparison, the performance of pure algebraic decoding is also
included. We see that CGA($P$, 2) for $P$ = 1, 2 and 3 outperforms the GMD algorithm ($P = 0$) and
pure algebraic decoding. At BER = $10^{-4}$, CGA(3,2) achieves a 0.7 dB coding gain over the GMD
algorithm and a 0.9 dB coding gain over pure algebraic decoding at a cost of 8 decoding
trials, while the GMD algorithm requires 4 decoding trials. The bit and block error performances
of the (255,223) NASA standard RS code over GF($2^8$) are shown in Figure 4(b).

CGA($P, q'$) will be used in our proposed concatenated turbo coding system for decoding
the outer code.

4. A Concatenated Turbo Coding System

In this section, we describe the proposed concatenated turbo coding system in which the 
inner turbo decoding is performed in parallel mode as described in Section 2 and the outer 
RS code is decoded with both algebraic and CGA(P, q') algorithms. 

The concatenated system is shown in Figure 5(a). The inner code is a turbo code with two
identical $(n_i, k_i)$ block component codes $C_i$ and the outer code is an $(n_o, k_o)$ RS (or shortened
RS) code $C_o$ over GF($2^m$) with minimum distance $d$. For binary transmission, each code
symbol in GF($2^m$) is expanded into a binary $m$-tuple, called an $m$-bit byte. Two types of
concatenations are proposed: two dimensional and three dimensional concatenations. Both
encoding and decoding are carried out in two stages.

4.1. Encoding 

An information sequence of $\lambda m k_o$ bits is segmented into a sequence of $\lambda k_o$ $m$-bit bytes. Each
$m$-bit byte is regarded as a symbol in GF($2^m$). This sequence of $\lambda k_o$ bytes is arranged as a
$\lambda \times k_o$ array $U$ as shown in Figure 5(b); each column consists of $\lambda$ bytes and each row consists
of $k_o$ bytes. At the first stage of encoding, each row of $U$ is encoded into an RS codeword
in $C_o$. This results in $\lambda$ RS codewords arranged in a $\lambda \times n_o$ array, denoted $V_1$, as shown in
Figure 5(c).

Consider the two dimensional concatenated coding scheme. Let $\delta$ be the integer such
that $\delta k_i = \lambda m n_o$. The interleaver between the outer encoder and the inner encoder reads
the array $V_1$ column by column and writes it row by row into a $\delta \times k_i$ array, denoted $V_2$, in bit
form as shown in Figure 5(d). Each column consists of $\delta$ bits and each row consists of $k_i$
bits. At the second stage of encoding, the array $V_2$ is encoded into a $\delta \times (2n_i - k_i)$ array $V_3$
with turbo encoding as shown in Figure 5(e). The turbo encoding is carried out as described
in Section 2. The array $V_3$ consists of three subarrays: the information array $V_2$ and the parity
arrays $P_1$ and $P_2$. Each row of $V_3$ is a turbo codeword which consists of $k_i$ information
bits and two parts of parity check bits, each consisting of $n_i - k_i$ parity check bits. $V_3$ is a
concatenated turbo codeword. $V_3$ is transmitted row by row. The rate of the concatenated
turbo code is

$$R = \frac{\lambda m k_o}{\lambda m n_o + 2 \lambda m n_o (n_i - k_i)/k_i}.$$

If a three dimensional concatenated coding scheme is used, an integer $\delta'$ is chosen such
that $\delta' k_i = \lambda n_o$. The array $V_1$ is read column by column and written row by row into $m$ $\delta' \times k_i$
arrays, denoted $V_2^{(1)}, V_2^{(2)}, \ldots, V_2^{(m)}$, in bit form as shown in Figure 5(f) (bit demultiplexing).
The $i$-th bit of each $m$-bit byte is put into array $V_2^{(i)}$, $1 \le i \le m$. At the second stage of
encoding, each array $V_2^{(i)}$ is encoded into a $\delta' \times (2n_i - k_i)$ array $V_3^{(i)}$ with turbo encoding,
$1 \le i \le m$. Each array $V_3^{(i)}$ also consists of three subarrays: the information array $V_2^{(i)}$ and the
parity arrays $P_1^{(i)}$ and $P_2^{(i)}$, as shown in Figure 5(g). The three dimensional concatenated code
has the same code rate as the two dimensional concatenated code.
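
Bit demultiplexing can be illustrated on a flat stream of $m$-bit bytes. The sketch below assumes the bytes arrive in the order in which they are read from $V_1$ and numbers bits from the most significant one; the bit ordering and the function names are assumptions, and the row-by-row write into $\delta' \times k_i$ arrays is left out.

```python
from typing import List, Sequence


def bit_demultiplex(symbols: Sequence[int], m: int) -> List[List[int]]:
    """Split m-bit bytes into m bit streams; stream i holds bit i of every byte.

    Bit 0 is taken to be the most significant bit (an assumption).
    """
    planes: List[List[int]] = [[] for _ in range(m)]
    for s in symbols:
        for i in range(m):
            planes[i].append((s >> (m - 1 - i)) & 1)
    return planes


def bit_multiplex(planes: Sequence[Sequence[int]]) -> List[int]:
    """Inverse operation, used after the m turbo decoders make hard decisions."""
    symbols = []
    for bits in zip(*planes):
        s = 0
        for b in bits:
            s = (s << 1) | b
        symbols.append(s)
    return symbols
```

Each of the $m$ streams can then be encoded, and later decoded, by its own turbo codec, with `bit_multiplex` reassembling the bytes for the outer decoder.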

4.2. Decoding 

First, consider the two dimensional concatenated coding scheme. Let $R$ be the received
array corresponding to $V_3$. It is decoded in two stages, inner decoding and outer
decoding. At inner decoding, $R$ is decoded with turbo decoding in parallel mode as described
in Section 2. At the end of each phase of a decoding iteration, the two component decoders of the
turbo decoder, DEC1 and DEC2, produce two decoded information arrays, $\hat{V}_2^{(1)}$ and $\hat{V}_2^{(2)}$,
along with the reliability L-values of the decoded bits and extrinsic values. The two estimated
arrays $\hat{V}_2^{(1)}$ and $\hat{V}_2^{(2)}$ are compared. Using the BM stopping criterion, if $\hat{V}_2^{(1)}$ and $\hat{V}_2^{(2)}$ match in
every bit position, the turbo decoding iteration process can be stopped.

If two corresponding bits do not match at a certain bit position in $\hat{V}_2^{(1)}$ and $\hat{V}_2^{(2)}$, a
hard decision at this position based on $L(\hat{u}^{(1)})$ and $L(\hat{u}^{(2)})$ given in (1) is likely to result
in an error. Suppose $\hat{V}_2^{(1)}$ and $\hat{V}_2^{(2)}$ do not match in all bit positions. We rearrange them
into arrays $\hat{V}_1^{(1)}$ and $\hat{V}_1^{(2)}$ corresponding to the RS code array $V_1$ shown in Figure 5(c). The
mismatched bit positions in $\hat{V}_2^{(1)}$ and $\hat{V}_2^{(2)}$ will result in mismatched symbol positions in $\hat{V}_1^{(1)}$
and $\hat{V}_1^{(2)}$. Now we compare the corresponding rows of $\hat{V}_1^{(1)}$ and $\hat{V}_1^{(2)}$ and count the
symbol positions where the two corresponding symbols do not match. Hard decisions at these
symbol positions are likely to result in symbol errors. If the number of mismatched symbol
positions for each pair of corresponding rows in $\hat{V}_1^{(1)}$ and $\hat{V}_1^{(2)}$ is less than or equal to the
error correcting capability $t = \lfloor (d-1)/2 \rfloor$ of the outer code and if symbol errors resulting
from hard decisions are confined to these mismatched positions, then the outer RS
code can be used to correct these errors. Based on this reasoning, we can now formulate a
criterion for stopping the inner turbo decoding iteration and letting the outer decoder remove
the remaining errors (if any).

Symbol Matching (SM) Stopping Criterion: Compare $\hat{V}_1^{(1)}$ and $\hat{V}_1^{(2)}$ row by row.
If the number of mismatched symbol positions for each pair of corresponding
rows is less than or equal to $\lfloor (d-1)/2 \rfloor$, then stop the inner turbo decoding
iteration.
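
The SM test can be sketched as a row-wise mismatch count. In this illustration the two estimated arrays are given as lists of rows of symbols (one row per RS codeword); the representation and the function name are assumptions of the sketch.

```python
from typing import Sequence


def symbol_matching_satisfied(V1_a: Sequence[Sequence[int]],
                              V1_b: Sequence[Sequence[int]],
                              d: int) -> bool:
    """True when every pair of corresponding rows differs in at most
    t = floor((d - 1) / 2) symbol positions."""
    if len(V1_a) != len(V1_b):
        raise ValueError("arrays must have the same number of rows")
    t = (d - 1) // 2
    for row_a, row_b in zip(V1_a, V1_b):
        if len(row_a) != len(row_b):
            raise ValueError("corresponding rows must have the same length")
        mismatches = sum(1 for x, y in zip(row_a, row_b) if x != y)
        if mismatches > t:
            return False
    return True
```

Unlike the BM test, this check tolerates a limited number of disagreements per row, which is why it can stop the inner decoder earlier and leave the residual errors to the outer code.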

When the inner turbo decoding is stopped based on the SM criterion, hard decisions are
made at the mismatched positions in $\hat{V}_1^{(1)}$ and $\hat{V}_1^{(2)}$ based on the L-values $L(\hat{V}_1^{(1)})$ and
$L(\hat{V}_1^{(2)})$. This results in an estimated array $\hat{V}_1$. Then the second decoding stage starts. The
outer decoder decodes each row of $\hat{V}_1$ based on an algebraic decoding algorithm, such as the
Euclidean algorithm. If each row of $\hat{V}_1$ is decoded successfully, decoding is done. The parity
check symbols are removed from each decoded RS codeword and the decoded information
symbols are then delivered to the user. If not all the rows of $\hat{V}_1$ are decoded successfully,
the outer decoder instructs the inner turbo decoder to resume decoding iteration from the
phase where it was stopped. Then the above inner/outer decoding process continues until
either all the rows of $\hat{V}_1$ are decoded successfully or the inner decoding reaches a maximum
number $I_{max}$ of iterations. In the latter case, the outer decoder decodes $\hat{V}_1$ based on a
CGA($P, q'$) algorithm and stops.

The above decoding process is illustrated by the flow chart shown in Figure 6. 
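
The interaction between the two decoders amounts to a small control loop. The sketch below is a hypothetical skeleton: the four callables stand in for the turbo decoding phase, the SM test, the algebraic outer decoder and the CGA fallback, and their names and signatures are assumptions made for illustration, not the authors' interfaces.

```python
from typing import Callable, List, Tuple


def interactive_decode(run_phase: Callable[[], None],
                       sm_satisfied: Callable[[], bool],
                       algebraic_decode: Callable[[], Tuple[bool, List]],
                       cga_decode: Callable[[], List],
                       max_iterations: int) -> Tuple[List, int, str]:
    """Alternate inner phases and outer decoding attempts.

    Returns (decoded rows, number of phases used, name of the outer decoder
    that produced the result).
    """
    max_phases = 2 * max_iterations  # each iteration has two phases
    for phase in range(1, max_phases + 1):
        run_phase()  # one parallel phase of DEC1 and DEC2
        if sm_satisfied():
            ok, rows = algebraic_decode()
            if ok:
                return rows, phase, "algebraic"
            # Outer decoding failed: resume inner decoding at the next phase.
    # Iteration limit reached: fall back to reliability-based decoding.
    return cga_decode(), max_phases, "cga"
```

Passing the decoders in as callables keeps the control flow separate from any particular turbo or RS implementation, so the loop can be exercised with simple stubs.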

The SM criterion is a very effective stopping criterion. It requires only simple binary or
logic operations. At the end of each decoding phase, it requires $2K = 2\delta k_i$ binary operations
to make hard decisions to form the estimated arrays $\hat{V}_1^{(1)}$ and $\hat{V}_1^{(2)}$, and $K$ bit-comparisons to
compute the numbers of mismatched symbol positions for all pairs of corresponding rows in
$\hat{V}_1^{(1)}$ and $\hat{V}_1^{(2)}$.

The decoding for the three dimensional concatenated code is similar to that for the two
dimensional code. At the receiver, there are $m$ received arrays, $R^{(1)}, R^{(2)}, \ldots, R^{(m)}$, corresponding
to $V_3^{(1)}, V_3^{(2)}, \ldots, V_3^{(m)}$, respectively. They are also decoded in two stages, inner decoding and
outer decoding. At inner decoding, each $R^{(i)}$, $1 \le i \le m$, is decoded with turbo decoding
individually. This allows us to use $m$ identical turbo decoders in parallel, each decoding
one received array, in order to speed up the decoding process. At the end of each phase of a decoding
iteration for all turbo decoders, the SM stopping criterion is used to terminate the iterative
process of inner turbo decoding. When the SM criterion is satisfied, hard decisions are made
by all $m$ decoders. These hard decisions form $\lambda n_o$ $m$-bit bytes such that the $i$-th bit of each
byte is from the hard decision of the $i$-th decoder, $1 \le i \le m$. These $\lambda n_o$ $m$-bit bytes form
the estimated array $\hat{V}_1$. Then the second decoding stage proceeds as for the two
dimensional concatenated code.

5. Examples and Simulation Results 

The proposed interactive concatenated turbo coding system has been simulated for both
AWGN and Rayleigh channels. Simulation results show that this coding system achieves
both good bit-error and frame-error performances without an error floor. Furthermore, the SM
stopping criterion effectively terminates the inner turbo decoding iteration and shortens the
inner decoding delay.

Consider an example in which the (228,212) shortened RS code over GF($2^8$) is used
as the outer code and the (64,57) distance-4 extended Hamming code is used as the two
component codes for constructing the inner turbo code. We choose $\lambda = 4$. Then $\delta = 128$.
The rate of this system is $R = 0.75$. The bit-error and frame-error performances of the two
dimensional scheme of this system for the AWGN channel are shown in Figure 7(a). We see the
waterfall performance without an error floor. For the Rayleigh channel without channel side
information, the bit-error and frame-error performances are shown in Figure 7(b). Again
they display waterfall error performance. From Figure 7(a), we see that for the AWGN channel
and at BER = $10^{-4}$, the proposed iterative decoding achieves a 5.5 dB coding gain over the
uncoded BPSK system, which is only 1.2 dB away from the Shannon limit for rate $R = 0.75$.

The bit-error performance of the three dimensional scheme of this system is also shown 
in Figure 7(a). From Figure 7(a), we see that the three dimensional concatenated coding 
scheme has better performance than the two dimensional scheme at low SNR and the gap 
between those two schemes is small at high SNR. In the three dimensional concatenated 
coding scheme, the $m$ bits in each code symbol over GF($2^m$) are decoded by $m$ turbo decoders
independently. This results in a better error performance at low SNR compared with the 
two dimensional code. Furthermore, it allows us to use m identical turbo decoders to decode 
m received arrays in parallel which increases the decoding speed by a factor close to m. 

To show the effectiveness of the SM criterion, we stop the inner turbo decoding iterations
with four stopping criteria: (1) a fixed number of iterations = 10, (2) the CE criterion, (3) the
BM criterion, and (4) the SM criterion. The maximum number of decoding iterations, $I_{max}$,
for the last 3 criteria is also set to 10. Tables 1(a) and 1(b) display: (1) the average numbers
of iterations required for each criterion, and (2) the number of blocks in error after inner
decoding, outer algebraic decoding, and CGA(1,2) decoding of 1,000 blocks. Tables 1(a) and
1(b) are for SNR's of 2.9 and 3.0 dB, respectively. Consider SNR = 3.0 dB. From Table 1(b),
we see that the SM criterion saves 5.72 iterations while the BM and CE criteria save 4.74 and 2.37
iterations, respectively, on average compared with $I_{max} = 10$. After inner decoding, if the SM
criterion is used, there are 147 RS words in error, while with the CE criterion there are only
12 RS words in error. However, after either algebraic or CGA(1,2) outer decoding, there is
no RS word in error. Finally, Figure 8 displays the average numbers of decoding iterations
required with the SM and CE criteria, respectively, for the example system over the AWGN channel.
From Figures 3, 7(a) and 8, we see that the SM criterion is more effective than the CE criterion.
It reduces the number of decoding iterations with very little performance degradation.

Consider another example in which the (255,223) RS code over GF($2^8$) is used as the outer
code and the (32,16) RM code with minimum distance 8 is used as the two component codes
for constructing the inner turbo code. We choose $\lambda = 4$; then $\delta = 510$. The rate of this
system is $R = 0.29$. The bit-error and frame-error performances of the two dimensional
scheme of this system for the AWGN channel are shown in Figure 9. We see that at BER = $10^{-4}$,
the proposed decoding achieves a 7.3 dB coding gain over the uncoded BPSK system, which
is 1.75 dB away from the Shannon limit for rate $R = 0.29$.

6. Conclusion 

In this paper, we have presented a high performance concatenated turbo coding system 
in which the inner and outer decoders interact to achieve good error performance. An 
effective criterion for stopping inner turbo decoding iteration has been proposed, which is 
more effective than the CE criterion in reducing the number of decoding iterations. Also 
presented in this paper is a reliability-based decoding algorithm which provides a good trade-off
between the error performance of the Chase-2 algorithm and the complexity of the GMD decoding
algorithm.

Table 1: Comparisons of different stopping criteria
(a) SNR=2.9 dB
(b) SNR=3.0 dB



Figure 1: A block diagram of the turbo encoder with two component codes

Figure 2: Serial and parallel decoding structures of turbo codes

Figure 3: Error performances and numbers of iterations of parallel and serial turbo decodings
of the (64,57) Hamming code

Figure 4: Bit and block error performances of Chase-GMD decoding of the (31,25,7) and
(255,223,33) RS codes

Figure 5: Two and three dimensional concatenated turbo codes; (a) Block diagram of a
concatenated turbo coding system

Figure 6: Flow chart for iterative decoding of concatenated turbo codes

[Plot; axis label: Error Probability]

Figure 7: (a) Bit and frame error performances of iterative decoding of the concatenated
turbo code with the (228,212) shortened RS code and the (64,57) extended Hamming code
as outer code and component codes of inner turbo code over AWGN channel

Figure 7: (b) Bit and frame error performances of iterative decoding of the concatenated
turbo code with the (228,212) shortened RS code and the (64,57) extended Hamming code
as outer code and component codes of inner turbo code over Rayleigh fading channel

[Plot; axis label: Error Probability]

Figure 9: Bit and frame error performances of iterative decoding of the concatenated turbo
code with the (255,223) RS code and the (32,16) RM code as outer code and component
codes of inner turbo code over AWGN channel